perm filename THESIS[RDG,DBL] blob sn#605138 filedate 1981-08-06 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00008 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	Things I want to do for a thesis:
C00003 00003	Conversation with Mark Stefik
C00008 00004	Conversation with Mike Genesereth
C00009 00005		Thoughts - 15 May
C00013 00006		Ideas - as of 8-June
C00016 00007		Analogy
C00018 00008		What is Expertise?
C00031 ENDMK
C⊗;
Things I want to do for a thesis:

Basically, goal-driven: A real application program.

Must:
	(Strong) use of RLL

Desired:
	Impact on AI
	Respectable in its domain. (Ie not a toy.)

Considerations:
	Representational vs Learning (vs ...)
	Computational (Reasoning) vs Communicational

Ideas:
	Communication - see page 4.
Conversation with Mark Stefik

(In)Validate a thesis

Problem: Represent (& communicate) a theory

Translate theory from, say, Predicate Calculus, into some operational model.
	Subsumes - Automatic Programming
[Discussion 24-June]
1. Design an experiment to discriminate among theories -- ie the input would
	be a set of theories, and the output said experiment.
  Issues: How to represent a "Theory", together with its ramifications, ...
2. Issues of planning -

Mailed to STEFIK @ PARC & @ SUMEX -- 18:31 20-June
Mark:
	The time has come to start thinking about an eventual thesis topic.
My ideal for said thesis would be some application task which
(1) strongly involved RLL and (2) was of a sufficiently complex nature that it could
potentially make a real contribution both to that domain, and to AI in general;
perhaps by pushing at some AI-ish concept (eg use of analogy or meta-level
reasoning, or the application of large quantities of diverse knowledge).

Do you think the field of Molecular Genetics offers such possibilities?
In your mind, are there major open problems in this domain, waiting only for
someone's diligent hard work to solve?
In what part of this domain would you now probe, if you had your life to live over?
Would they require an almost expert-level knowledge of the domain on my part?
(Are there currently repositories of vast amounts of relevant knowledge
which could be tapped -- such as the library of E. coli strains, and related genes?)

Could we get together sometime in the near future (say early next week) to discuss
this?  Thanks,
	Russ

∂21-Jun-80  1026	Stefik at SUMEX-AIM 	Re: Advice to a Young Scientist   
Date: 21 Jun 1980 1022-PDT
From: Stefik at SUMEX-AIM
Subject: Re: Advice to a Young Scientist  
To:   RDG at SU-AI

In response to your message sent 20 Jun 1980 1831-PDT

Russ,
	I'll be around on Tuesday and Thursday after the VLSI class at
3:30.  Perhaps one of those times would be suitable -- I guess Thursday
might be best for me since there may be things that need doing right away
after the first day of class.  Mark

PS.  You may want to explore the general set of projects in HPP, and list
some of the pros and cons for them; also it would be good to narrow down
your area of interest in terms of AI topics.

PPS.  By way of open issues in MOLGEN, I just submitted two papers to the
AI journal;  drafts of these are available as working papers in the terminal
room as HPP-80-12 & 13.  Both papers have sections sketching areas of future
research in planning.
-------

∂Mailed to STEFIK@SUMEX 12:35 23-June
Ok - I'll plan to see you after class on Thursday then.
(I'll be sitting in on that class, at least for the first few meetings.)
	Thanks
Russ
Conversation with Mike Genesereth

Notations for Communication

Defn: "Notation" relevant to communication, "Representation" for Reasoning

Task: Given model of "hearer" and question, deduce best way of communicating
	this info to him -- i.e. graphs, or predicate calculus, or ...

      Close to pragmatics
	Thoughts - 15 May

	Analogy
One important facet for any intelligent system is the ability to find
and use analogies.
Understanding analogies is, therefore, one of
the main goals of the EURISKO project (see ?).
Especially useful will be deriving new facts in one domain based on
known derivations found in some other domain.
Towards this end we (DBL & I) have devised the game plan which follows.

DBL claims many analogies due to common type of derivation --
	eg same type of proof

Analogy : Inheritance :: Heuristic : Algorithm
(ie use inheritance to infer "guaranteed" things; but analogy
is useful for weaker conjectures...  Sounds like relation between
algorithm and heuristic.)

Task:
My approach will begin by encoding facts about several domains,
in distinct KBs.  This breadth of facts should serve as a source for
interesting analogies -- or at least that's the hope.

Initially, we will declare (units) U and V to be
"analogous" if there is some slot, S, such that A:S and B:S are EQUAL.
(Provided neither of those values were trivial.)
Eventually we will improve upon this superficial method.
Eg, consider binary relations R such
	R( U:S, V:S )
holds. Consider R beings
=, Subset, ImpliedBy, Generalization ...
and finally Analogous.
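
A minimal sketch of this initial slot-equality test (purely illustrative, not RLL
code: units are taken here to be Python dicts mapping slot names to values, and the
names TRIVIAL_VALUES, analogous, and find_analogies are invented for this sketch):

# Hypothetical sketch: units U and V are "analogous" if some slot S has
# U:S equal to V:S, provided the shared value is not trivial.
TRIVIAL_VALUES = (None, True, False, (), [], "")    # assumed notion of "trivial"

def analogous(u, v):
    """Return the slots on which units u and v (dicts of slot -> value)
    agree non-trivially; an empty list means no analogy is declared."""
    return [s for s in u.keys() & v.keys()
            if u[s] == v[s] and u[s] not in TRIVIAL_VALUES]

def find_analogies(kb):
    """Brute-force pass over a KB {unit-name: slots}, returning every pair
    of distinct units which the slot-equality test relates."""
    names = sorted(kb)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = analogous(kb[a], kb[b])
            if shared:
                pairs.append((a, b, shared))
    return pairs

The refinement sketched just above would replace the equality test here with
whichever relation R (Subset, ImpliedBy, ...) is being tried.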

Initial Set of Domains
1) Oil Spill - basically because I've already spent a good deal of time creating
	this KB (for the Expert System workshop Summer '80).
2) Planning, esp over the domain of Automatic Programming -- two reasons:
	i) closely related to planning stuff I'm doing with Rand, anyway.
	ii) an interesting area needed for eventual bootstrapping;
		and as such, something I've already spent some time on in RLL.
3) Programming itself -- see two above.

Eventually we should be able to put in facts from MYCIN, and other EMYCIN systems;
and other existing Expert domains.  (medical, legal, ...)
	- realize the basic work will have already been done...
	Ideas - as of 8-June
----
The problem in AI:
	Understanding understanding.
----

1. Decomposition of representation
    into:
	Inference processes in general
	Inheritance Methods (like inference)
		(eg Defaults)
	Physical organization
	Matching processes - for retrieval
	Storage of facts
	Caching values, vs recomputing
    Attack:
	An AGE-like system, which takes specification of task, returning code
	  i) by big switch
	  ii) by mixing and matching
	  iii) by "deep understanding" of domain

2. Analogy - esp as relates to abstraction
    Realization that this seems everywhere:
	Natural language  ("bachelor bear")
	Circumscription  (leaving only essential facts)
	Naive physics (Hayes) -- related to Novak & deKleer (Envisioning)
	Mike's sense: Given X, claim X is analogous to Y if ∃ a subtheory
	   of Y which is <isomorphic> to a subtheory of X
	   (see the sketch at the end of this item).
	   [Subtheory means fewer axioms, and possibly a change of names]
    Relates to other theories of analogy:
	Winston	   - elaborate feature matching
	Hayes-Roth - feature-set matching
    Attack:
	Include one or more KBs, complete with abstractions
	(better: have a theory for each; and a means for forming
	 subtheories)
	Input new "fact", domain, ...
	This is "matched" against the subtheories, and best case is returned
	[best?]
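
A minimal sketch of the subtheory-matching notion above (purely illustrative: a
"theory" is taken to be a set of ground atoms such as ("flows", "current", "wire"),
only constants may be renamed, and common_subtheory is an invented name; the
exhaustive search is feasible only for tiny theories):

from itertools import product

def constants(theory):
    """All constant symbols appearing as arguments of the theory's atoms."""
    return sorted({arg for atom in theory for arg in atom[1:]})

def common_subtheory(x, y, min_size=2):
    """Try every renaming of X's constants into Y's constants and return the
    largest set of renamed X-axioms that also appear in Y (empty if fewer than
    min_size axioms ever match).  A faithful <isomorphism> would also require
    the renaming to be one-to-one; that check is omitted for brevity."""
    xs, ys = constants(x), constants(y)
    best = set()
    for image in product(ys, repeat=len(xs)):        # every renaming xs -> ys
        rename = dict(zip(xs, image))
        mapped = {(atom[0], *(rename[a] for a in atom[1:])) for atom in x}
        shared = mapped & set(y)
        if len(shared) >= min_size and len(shared) > len(best):
            best = shared
    return best

# eg  water  = {("flows", "water", "pipe"),   ("stored-in", "water", "tank")}
#     charge = {("flows", "current", "wire"), ("stored-in", "current", "capacitor")}
#     common_subtheory(water, charge) matches both axioms, renaming water -> current.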

3. Molecular genetics
	Need domain which is more fully understood
	EG function of protein which comes from nucleotide sequence

4. EURISKO itself
	What to encode - for what task?
	Analogy

Uses:
  1. To better understand a (new) concept -
	as in "fill-in additional slots"
  2. Concise description, for communication  (nice "shorthand")
  3. As source for explanation  (BGB's comment)

Basic Types of Analogy:
  1. Simple feature matching
    a. Winston style
    b. Hayes-Roth style (with sets of features - as in prototypes)
  2. Common abstraction (MRG) - find & match sub-theories
  3. Common perspective (rdg)
	- viewing both X and Y as Z's, they are essentially the same thing
  4. Lindley's idea of Interfield connections - "causally related"

Common Examples:
  1. Metaphor - "John is like a bear."
  2. Interfield connections - "Current flow is like water flow."
  3. Standard NL
    i) Lakoff's ideas, of `Time is a precious commodity'
    ii) Most terms are extendable - "bachelor bear"
  4. Strained - "a house is like a car, IN THAT both have radiators"
------

Scenarios for using Analogy-Finding-Box:
(in Use #1:)
  Preliminary Step:  Each of  n fields
	What is Expertise?
(22-Aug-80)
What does a Nobel Laureate have which separates him from the rest
of the populace?
Answering "Expertise in his field" only begs the question 
-- what exactly is expertise?
It must begin by including a tremendous amount of knowledge (whatever that is).
But raw data, by itself, is clearly insufficient.
(Otherwise we would be forced to conclude that every
advanced text is an expert.)
The information must be in a usable form; and the expert must know how to use it
for the task at hand.
This necessitates knowledge describing when each bit of knowledge is applicable;
and at a higher level, when to use a particular sub-model over another.
A bona fide expert must have this type of information, in abundance,
to navigate within the space of solution methodologies.

In the last decade numerous AI tools and methods have evolved
to deal with these issues.
The entire rule formalism may be viewed as a large part of this attempt;
before that, predicate calculus, and theories of semantics apparently had
similar goals.
However, each technique seems applicable only to its particular domain
(such as infectious diseases), and only
to its specific type of task (eg diagnosis).
Attempts to use the same domain data for another task, say teaching, have often
exposed the gaps and limitations of this cut at the data.

To reiterate, expertise must contain (1) a wealth of raw information and
(2) procedures which dictate how to utilize this information,
and when to use each part.
Furthermore, experience has shown that the overall routines used for a particular
application are quite dependent on the actual type of task;
this dependency seems to have a major impact on the nature of the stored data
itself.

The apparent conclusion is that expertise must be relative to both the
domain, and the particular type of task.
This is not surprising -- ingenious clinicians may be very poor at deciding which
experiment to try next; and it is the exception, not the rule, that
great experimenters are also good teachers.

A major difficulty encountered in designing expert systems is the incredibly
ill-defined nature of this task.
We do not have even a toehold on
(much less a comprehensive theory of) what is meant by expertise.
This paper will attempt to address the requirements we place on any system
(human or machine) before we will claim it is an expert; and then comment by
outlining a system, Homonculus, which should be capable of building up such
"experts", being itself an expert in that `expert-construction' task.
!	Definitions
Let me begin by defining some terms.

Expertise:
Given some task, <Y>, an expert in domain <X> is able to build/access
a working model of <X> applicable to <Y>, and use it effectively.

<X> might be Medicine, Chemistry, Molecular Genetics, ...
<Y> might be Diagnoses, Experiment Planning, Teaching, Conjecturing, Data Input ...

Working Model:
A set of facts, together with rules pertaining to applicability.
(Like RWW's LS-pairs)
Capable of simulating `real world' - eg Symbolic Execution [modulo Frame Problem]
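
A minimal sketch of a working model in this sense (purely illustrative: facts are
just strings, each rule carries its own applicability test, and the names Rule and
WorkingModel are invented here rather than taken from RWW's LS-pair machinery):

# Hypothetical sketch: a working model = a fact set plus rules, each rule saying
# when it applies and what it concludes.  "Simulation" is repeated forward
# application of the rules; the frame problem is ignored.
from dataclasses import dataclass, field
from typing import Callable, List, Set

Fact = str

@dataclass
class Rule:
    applicable: Callable[[Set[Fact]], bool]        # when may this rule be used?
    conclude:   Callable[[Set[Fact]], Set[Fact]]   # what new facts does it add?

@dataclass
class WorkingModel:
    facts: Set[Fact]
    rules: List[Rule] = field(default_factory=list)

    def step(self) -> Set[Fact]:
        """One round of symbolic execution: fire every currently-applicable
        rule and absorb its conclusions into the fact set."""
        new = set()
        for rule in self.rules:
            if rule.applicable(self.facts):
                new |= rule.conclude(self.facts)
        self.facts |= new
        return new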

Model:

Theory:

Simulation Structure:
!	Theories, Working Models, and the Like

Assumption: The structure of the rep'n depends mostly on the domain.
However, what fills it in (eg grain size) is a function of the task.
Similarly the basic calling sequence of the various routines is essentially
invariant to a given task.
However these actual functions will vary with each application.

[Is this the inference engine vs data arrangement?]

The structure of the representations may have to change from domain to domain, but 
the actual data filling out this structure is highly dependent on the task;
there is a similar need to adjust the accessing/utilizing routines.

Even this is insufficient: where are things like creativity? How does one decide
the new directions in which to explore? For this one has to develop a "feel",
a gestalt, for the full domain. This meta-theory of science, ...

In this light, let us readdress questions which philosophers have already considered.
What is a Theory? How does it relate to a model? When is a model applicable?
What constitutes effective
predictive power? When are metaphors, and analogies, pertinent?

What are possible uses of a model? -- Prediction, throw out "obviously wrong"
conjectures [Gelernter], quickly conjecture reasonable ideas, simulate.

Meta-issues: When is a proposed model adequate? Can a model encode its own
shortcomings?
What does it take to validate (test) a theory (or model)? When is it totally
discredited, and when can it be "hacked up" to account for some new phenomenon?
[See Velikovsky.]
How to decide what counts as reinforcement?
When does it have real explanatory capability? [How vs Why]

Remaining issues
Conjecture (eg Circumscription) vs Laws
Cause/Effect ← Frame problem
Different perspectives of a theory -- or distinct models of some fact.

What is a theory?
Copernicus, Darwin, <plate tectonics>; Kuhn ← self-descriptive
Methodologies, Esthetics,
!Scenario:
User:	I want a KB which knows about Programming Constructs.
Homon:	For what task?
User:	Program verification.
Homon:	Tell me about Program Verification.
User:	It tries to determine whether executing a particular listing of source code
	will achieve a specified goal.
Homon:	Oh, so this task is essentially validation.
User:	Yes.
Homon:	Describe the input which you are trying to validate.
User:	Source code, in the language LISP.
Homon:	Describe the expected results. (Is there a function which ...)
User:	Yes.

Relevant Parts:	Causal sub-model
		Symptoms/cures
		Fuzziness


Note Homonculus has its own expertise - in knowledge extraction/input, and knowledge
application.
Ie it has its own scenario which it follows to extract this data.

This is due to Homonculus's Data Acquisition Frame.
!	Homonculus
Large set of tasks - 
	Types of Tasks
<In all - causality, meta-planning, ... >

Teaching, diagnosing, assisting, conjecturing, validation, ...

Hierarchy - so under teaching is: Cut&Dry, Overlay, Differential, ...

Relevant parts for Teaching:
	Derivation of student model

Minimal level understanding of numerous domains
!Note RLL just handles Data, and its organization. This system would be capable
of dealing with the whole range of things we call expertise:
	Data and its organization
	Utilization of this information
	Means for converting to other tasks
Homonculus has an extensible collection of task-types, 
as well

If P knows the domain of chemistry,
	and the ideas of instruction,
  THEN P can teach chemistry.

If P knows the domain of infectious diseases,
	and the ideas of diagnoses,
  THEN P can diagnose infectious diseases.

If P knows the domain of theories/models,
	and the ideas of knowledge acquisition,
  THEN P can acquire (input in usable form) theories/models.
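
A minimal sketch of this rule pattern (purely illustrative: the agent P is reduced
to the set of domains and the set of task ideas it knows, and can_perform is an
invented name):

# Hypothetical sketch: domain knowledge + task-type knowledge => competence.
def can_perform(p, domain, task):
    """p is a dict with a set of known 'domains' and a set of 'task_ideas'."""
    return domain in p["domains"] and task in p["task_ideas"]

p = {"domains":    {"chemistry", "infectious diseases"},
     "task_ideas": {"instruction", "diagnosis"}}

assert can_perform(p, "chemistry", "instruction")           # P can teach chemistry
assert can_perform(p, "infectious diseases", "diagnosis")   # P can diagnose
assert not can_perform(p, "theories/models", "knowledge acquisition")  # knows neither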

Not AGE's decision tree.
Not EMYCIN.
Not RWW's MetaFol - but still uses his Simulation Structures.
Not Schank's scripts - as this is extensible, with arbitrary depth before bottoming out.
Not Psi's program generation (although this is closest).  It works with higher
level input, less guided -- more about expert programs, only.
(It could have a Program Generation Model, which executes)